Questions Are All You Need to Train a Dense Passage Retriever
Authors
Abstract
We introduce ART, a new corpus-level autoencoding approach for training dense retrieval models that does not require any labeled training data. Dense retrieval is a central challenge for open-domain tasks, such as Open QA, where state-of-the-art methods typically require large supervised datasets with custom hard-negative mining and denoising of positive examples. ART, in contrast, only requires access to unpaired inputs and outputs (e.g., questions and potential answer passages). It uses a new passage-retrieval autoencoding scheme, where (1) an input question is used to retrieve a set of evidence passages, and (2) the passages are then used to compute the probability of reconstructing the original question. Training for retrieval based on question reconstruction enables effective unsupervised learning of both passage and question encoders, which can be later incorporated into complete QA systems without further finetuning. Extensive experiments demonstrate that ART obtains state-of-the-art results on multiple benchmarks with only generic initialization from a pre-trained language model, removing the need for labeled data and task-specific losses. Our code and model checkpoints are available at: https://github.com/DevSinghSachan/art.
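The retrieve-then-reconstruct training signal described above can be sketched in a few lines. The code below is not the authors' implementation (see the linked repository for that); it uses toy linear encoders, random feature vectors, and a fixed random projection standing in for the frozen pre-trained language model, purely to illustrate the two-step loop: retrieve top-K passages with a dual encoder, then train the retriever to match a target distribution derived from how well each passage "reconstructs" the question.

import torch
import torch.nn.functional as F

torch.manual_seed(0)

NUM_PASSAGES, DIM, TOP_K = 1000, 128, 8

# Stand-ins for the trainable dual encoders (transformers in the real system).
question_encoder = torch.nn.Linear(DIM, DIM)
passage_encoder = torch.nn.Linear(DIM, DIM)

# Toy corpus and question, represented as fixed feature vectors.
corpus_feats = torch.randn(NUM_PASSAGES, DIM)
question_feats = torch.randn(1, DIM)

# Placeholder for the frozen pre-trained LM: a fixed random projection whose
# score stands in for log P(question | passage). No gradients flow through it.
FROZEN_PROJ = torch.randn(DIM, DIM)

def reconstruction_log_prob(q_feats, p_feats):
    with torch.no_grad():
        return (q_feats @ FROZEN_PROJ @ p_feats.T).squeeze(0)

optimizer = torch.optim.Adam(
    list(question_encoder.parameters()) + list(passage_encoder.parameters()),
    lr=1e-3,
)

for step in range(3):
    # (1) Retrieve: embed the question and the corpus, keep the top-K passages.
    q_emb = question_encoder(question_feats)          # [1, DIM]
    p_emb = passage_encoder(corpus_feats)             # [NUM_PASSAGES, DIM]
    scores = (q_emb @ p_emb.T).squeeze(0)             # retriever scores over corpus
    top_scores, top_idx = scores.topk(TOP_K)

    # (2) Reconstruct: the frozen "LM" scores how well each retrieved passage
    #     explains the original question; softmax gives a target distribution
    #     over the retrieved set.
    teacher_dist = F.softmax(
        reconstruction_log_prob(question_feats, corpus_feats[top_idx]), dim=-1
    )

    # Train the retriever so its own distribution over the top-K passages
    # matches the reconstruction-based distribution (a KL objective).
    loss = F.kl_div(F.log_softmax(top_scores, dim=-1), teacher_dist, reduction="sum")

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"step {step}: loss = {loss.item():.4f}")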
Similar resources
Linear Regions Are All You Need
The type-and-effects system of the Tofte-Talpin region calculus makes it possible to safely reclaim objects without a garbage collector. However, it requires that regions have last-in-first-out (LIFO) lifetimes following the block structure of the language. We introduce λ, a core calculus that is powerful enough to encode Tofte-Talpin-like languages, and that eliminates the LIFO restriction. Th...
All You Need Is Compassion
The paper presents a new deductive rule for verifying response properties under the assumption of compassion (strong fairness) requirements. It improves on previous rules in that the premises of the new rule are all first order. We prove that the rule is sound, and present a constructive completeness proof for the case of finite-state systems. For the general case, we present a sketch of a rela...
CNN Is All You Need
CNNs have been successfully used in audio, image and text classification, analysis and generation [12,17,18], whereas the RNNs with LSTM cells [5,6] have been widely adopted for solving sequence transduction problems such as language modeling and machine translation [19,3,5]. The RNN models typically align the element positions of the input and output sequences to steps in computation time for ...
All You Need Is Mentorship
I find it humbling to confess that most of the truly original ideas that have driven my research group’s agenda over four decades of time have come, not from my own brain, but instead from the minds of my trainees, both graduate students and post-docs. This on its own might explain why I, rather selfishly, have given them long leashes, allowing them to strike out on their own and craft their ow...
Domain-Specific Mashups: From All to All You Need
In recent years, alongside the proliferation of Web 2.0, we have witnessed drastic growth of the mashup market. An increasing number of different mashup solutions and platforms emerged, some focusing on data integration (a la Yahoo! Pipes), others on user interface (UI) integration, and some trying to integrate both UI and data. Most of the proposed solutions have a common characteristic: they aim at provid...
Journal
Journal title: Transactions of the Association for Computational Linguistics
Year: 2023
ISSN: 2307-387X
DOI: https://doi.org/10.1162/tacl_a_00564